Single molecule experiments on single- and double-stranded DNA have sparked a
renewed interest in the force-extension of polymers. The extensible Freely Jointed
Chain (FJC) model is frequently invoked to explain the observed behavior of
single-stranded DNA. We demonstrate that this model does not satisfactorily describe
recent high-force stretching data. We instead propose a model (the Discrete
Persistent Chain, or "DPC") that borrows features from both the FJC and the Wormlike
Chain, and show that it resembles the data more closely. We find that most of
the high-force behavior previously attributed to stretch elasticity is really a feature
of the corrected entropic elasticity; the true stretch compliance of single-stranded
DNA is several times smaller than that found by previous authors. Next we
elaborate our model to allow coexistence of two conformational states of DNA, each
with its own stretch and bend elastic constants. Our model is computationally
simple, and gives an excellent fit through the entire overstretching transition of
nicked, double-stranded DNA. The fit gives the first values for the elastic constants
of the stretched state. In particular we find the effective bend stiffness for DNA in
this state to be about 10 nm · $k_BT$, a value quite different from either B-form or
single-stranded DNA.

I. INTRODUCTION AND SUMMARY



New single-molecule manipulation techniques have opened the mechanical properties of 
individual macromolecules to much more direct study than ever before. For example, optical- 
trap measurements give the force-extension relation of a single molecule of lambda DNA, 
from which we can deduce the molecule's average elastic properties by fitting to a model. 
Part of the beauty of this procedure is that we pass from an optical-scale measurement (the 
total end-to-end length of the DNA is typically over 10 μm) to a microscopic conclusion (the
elastic constants of the 2 nm diameter DNA molecule). But by the same token, we must
be careful with the interpretation of our results. Fitting a physically inappropriate model 
to data can give reasonable-looking fits, but yield values of the fit parameters that are not 
microscopically meaningful. 

We will illustrate the above remarks by studying high-force measurements of the force- 
extension relation for single-stranded DNA. Previous authors have fit this relation at low 
to moderate forces to the Extensible Freely Jointed Chain (EFJC) model, obtaining as fit 
parameters a link length and an extension modulus for increasing the contour length of the 
chain. We argue that to capture the microscopic physics, at least one additional element 
must be added to the model, namely a link stiffness. The resulting model fits the data better 
than either the EFJC or the Extensible Worm-Like Chain (EWLC) models. The fit also 
yields a much larger value of the extension modulus than previously reported. The reason
for this discrepancy is that high-force effects previously attributed to intrinsic stretching of 
the chain are, in our model, simply a part of the correct entropic elasticity. 

The mathematical formalism we introduce to solve our model is of some independent 
interest, being simpler than some earlier approaches. In particular, it is quite easy to extend 
our model to study a linear chain consisting of two different, coexisting conformations of the 
polymer, each with its own elastic constants. We formulate and solve this model as well. 
The model makes no assumptions about the elastic properties of the two states, but rather 
deduces them by fitting to recent data on the overstretching transition in nicked, double- 
stranded DNA. Besides giving a very good fit to the data, our model yields insight into the 
character of the stretched conformation of DNA. The model is flexible and can readily be 
adapted to the study of the stretching of polypeptides with a helix-coil transition. 
II. THE WORM-LIKE CHAIN AND THE FREELY JOINTED CHAIN



A. The Freely Jointed Chain 

A polymer is a long, linear, single molecule. The chemical bonds defining the molecule 
can be more or less flexible in different cases. The simplest model of polymer conformation 
treats the molecule as a chain of rigid subunits, joined by perfectly flexible hinges — a "freely 
jointed chain," or FJC (Flory, 1969). The FJC model is not very appropriate to double- 
stranded DNA, consisting of a stack of flat basepairs joined by both covalent bonds and 
physical interactions (hydrogen bonds and the hydrophobic base-stacking energy), but for 
single-stranded DNA (ssDNA) it forms an attractive starting point. 

Deviations from the FJC picture can come from a variety of interactions among the 
individual monomers: Individual covalent bonds may have bending energies that are not 
small relative to $k_BT$, successive monomers may have steric interactions, and so on. To
some extent we can compensate for the model's omission of such interactions by choosing 
an effective link length b that is longer than the actual monomer size. Since the FJC views 
the polymer as a chain of perfectly stiff links, choosing a larger b gives us a chain of longer 
links and thus effectively stiffens the chain. Accordingly, one views b as a fit parameter when 
deriving the force-extension relation of the model. The fit value of b can then depend both 
on the molecule under study, and on its external conditions like salt concentration, as those 
conditions affect the intramolecular interactions. 

To formulate the FJC we describe a molecular conformation by associating with each
segment a unit orientation vector $\hat t_i$, pointing in the direction of the $i$th segment, as sketched
in Fig. 1. In the presence of an external force $f$ along the $\hat z$ direction, we can define an energy
functional for the chain

$$\frac{\mathcal{E}_{\rm FJC}[\{\hat t_i\}]}{k_BT} = -\frac{fb}{k_BT}\sum_{i=1}^{N}\hat t_i\cdot\hat z. \tag{1}$$

In the absence of an external force, all configurations have equal energy and (neglecting
self-avoidance) the chain displays the characteristics of a random walk. To pull the ends of
such a chain away from each other a force has to be applied, as extending the chain reduces
its conformational entropy. The resulting entropic elastic behavior can be summarized in
the force-extension relation (Grosberg and Khokhlov, 1994)

$$\left\langle\frac{z}{L_{\rm tot}}\right\rangle = \coth\!\left(\frac{fb}{k_BT}\right) - \frac{k_BT}{fb}, \tag{2}$$

the well-known Langevin function. In the limit of low stretching force, all polymer models
reduce to Hooke-law behavior $f = k_{\rm sp}\langle z\rangle$; we define the effective spring constant by
$k = k_{\rm sp}\cdot L_{\rm tot}$, or

$$\left\langle\frac{z}{L_{\rm tot}}\right\rangle \to \frac{f}{k} + \mathcal{O}(f^2). \tag{3}$$

Expanding Eq. 2 gives the effective spring constant for the FJC as $k_{\rm fjc} = 3k_BT/b$. The fact
that the effective spring constant is proportional to the absolute temperature illustrates that
the elasticity in this model is purely entropic in nature.

At high stretching force, Eq. 2 gives $\langle z/L_{\rm tot}\rangle \to 1$; the extension saturates when all the
links of the chain are aligned by the external force. In reality, individual links are slightly
extensible; we will modify the model to introduce this effect in Sect. II.C.
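
The two limits of the Langevin relation (Eq. 2) can be checked numerically. The following is a minimal sketch, not part of the original analysis; the function names and the room-temperature value $k_BT \approx 4.11$ pN·nm are assumptions introduced here for illustration.

```python
import math

def langevin_extension(f_pN, b_nm, kBT_pN_nm=4.11):
    """Relative extension <z/L_tot> of a freely jointed chain (Eq. 2).

    kBT_pN_nm = 4.11 is an assumed room-temperature value, not a fit result.
    """
    x = f_pN * b_nm / kBT_pN_nm
    if abs(x) < 1e-4:
        # Series expansion avoids the cancellation in coth(x) - 1/x near x = 0.
        return x / 3.0 - x**3 / 45.0
    return 1.0 / math.tanh(x) - 1.0 / x

def fjc_spring_constant(b_nm, kBT_pN_nm=4.11):
    """Low-force effective spring constant k_fjc = 3 kBT / b, in pN."""
    return 3.0 * kBT_pN_nm / b_nm

if __name__ == "__main__":
    b = 1.5  # nm, an illustrative link length
    for f in (0.01, 1.0, 10.0, 100.0):
        print(f, langevin_extension(f, b), f / fjc_spring_constant(b))
```

At low force the two printed columns agree (Hooke-law regime), while at high force the Langevin extension saturates below 1 and the linear estimate overshoots.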



B. The Wormlike Chain 

As mentioned above, double-stranded DNA (dsDNA) is far from being a freely jointed 
chain. Thus it is unsurprising that while the FJC model can reproduce the observed linear 
force-extension relation of dsDNA at low stretching force, and the observed saturation at 
high force, still it fails at intermediate values of $f$. Another indication that the model is
physically inappropriate is that the best-fit value of the link length is $b \approx 100$ nm, completely
different from the physical contour length per basepair of 0.34 nm.

To improve upon the FJC, we must account for the fact that the monomers do resist
bending. In fact, the very great stiffness of double-stranded DNA can be turned to our
advantage, as it implies that successive monomers are constrained to point in nearly the
same direction. Thus we can treat the polymer as a continuum elastic body, its configuration
described by the position $\vec r(s)$ as a function of the relaxed-state contour length $s$ (see Fig. 2).
Continuing to treat the chain as inextensible gives the Worm Like Chain (Kratky and Porod,
1949; Saito et al, 1967). The local tangent and curvature vectors ($\hat t$ and $\vec w$, respectively)
are given by

$$\hat t(s) = \frac{d\vec r(s)}{ds}, \qquad \vec w(s) = \frac{d\hat t(s)}{ds}. \tag{4}$$

We temporarily assume that the chain is inextensible, expressed locally by the condition
that $|\hat t(s)| = 1$ everywhere.

To get an energy functional generalizing Eq. 1, we note that for a thin, homogeneous
rod the energy density of elastic strain is proportional to the square of the local curvature.
Adding the external-force term from Eq. 1 yields

$$\frac{\mathcal{E}_{\rm WLC}[\hat t(s)]}{k_BT} = \int_0^{L_{\rm tot}} ds\,\left[\frac{A}{2}\left|\frac{d\hat t(s)}{ds}\right|^2 - \frac{f}{k_BT}\,\hat t(s)\cdot\hat z\right]. \tag{5}$$

Eq. 5 makes it clear that the parameter $A$ is a measure of the bend stiffness of the chain.
$A$ is also the persistence length of the chain, the characteristic length scale associated with
the decay of tangent-tangent correlations at zero stretching force:

$$\langle \hat t(0)\cdot\hat t(s)\rangle_{\rm WLC} = e^{-|s|/A}. \tag{6}$$

The force-extension relation for the WLC was obtained numerically in (Marko and Siggia, 
1995); subsequently a high-precision interpolation formula was given in (Bouchiat et al., 
1999). At low force, the WLC also behaves like an ideal spring, with effective spring constant 
(Yamakawa, 1971) 

$$k_{\rm wlc} = \frac{3k_BT}{2A}. \tag{7}$$

Thus a WLC with stiffness parameter A yields a force-extension relation that at low force 
matches the FJC with b = 2A. 
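
The low-force correspondence $b = 2A$ can be illustrated with the widely used Marko-Siggia interpolation formula, $fA/k_BT = z/L + 1/[4(1 - z/L)^2] - 1/4$. This is a sketch under stated assumptions: the interpolation is an approximation to the numerical WLC solution cited above, not the exact result, and the function names and the value of $k_BT$ are introduced here for illustration.

```python
def wlc_force_marko_siggia(x, A_nm, kBT_pN_nm=4.11):
    """Force (pN) at relative extension x = z/L_tot, Marko-Siggia interpolation.

    An approximate closed form for the WLC; kBT_pN_nm = 4.11 is an assumed
    room-temperature value.
    """
    if not 0.0 <= x < 1.0:
        raise ValueError("relative extension must lie in [0, 1)")
    return (kBT_pN_nm / A_nm) * (x + 0.25 / (1.0 - x) ** 2 - 0.25)

def wlc_spring_constant(A_nm, kBT_pN_nm=4.11):
    """Low-force effective spring constant of the WLC, Eq. 7."""
    return 3.0 * kBT_pN_nm / (2.0 * A_nm)

def fjc_spring_constant(b_nm, kBT_pN_nm=4.11):
    """Low-force effective spring constant of the FJC, k = 3 kBT / b."""
    return 3.0 * kBT_pN_nm / b_nm

if __name__ == "__main__":
    A = 50.0  # nm, a typical persistence length for dsDNA
    x = 1e-4
    print(wlc_force_marko_siggia(x, A) / x, wlc_spring_constant(A),
          fjc_spring_constant(2.0 * A))
```

All three printed numbers coincide to within the small-$x$ correction, which is the statement that a WLC with stiffness $A$ matches an FJC with $b = 2A$ at low force.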

The remarks at the start of this subsection make it clear that the WLC is just an ap- 
proximation, valid in the limit where the persistence length A is much longer than the 
physical monomer length (and width). When these conditions are not met, the picture of 
the molecule as a thin, continuous, elastic body will not be accurate; short-length cutoff 
effects will then enter in an essential way. 



C. Experiments 

Early single-molecule stretching experiments showed that double-stranded DNA closely 
follows the predicted force-extension of the WLC at forces under 10 pN (Bustamante et al, 
1994). Later experiments probing the region 10 pN < f < 60 pN found a linear deviation 
from the WLC prediction, attributable to a Hooke-law stretching elasticity (Cluzel et al, 
1996; Smith et al, 1996; Wang et al, 1997). Adding this effect into the model introduces
a second fit parameter $E$ in addition to $A$. To lowest order in $f/E$ this modification just
amounts to multiplying the model's $\langle z/L_{\rm tot}\rangle$ by the factor $(1 + f/E)$; for dsDNA the resulting fit
is very good out to 60 pN.

The situation for single-stranded DNA has been less clear. Adding an extensibility factor 
to Eq. 2 again yields a model with two parameters ($b$ and $E$). Though this "extensible
FJC" (EFJC) model yielded impressive fits to the early experimental data, recent advances 
in single-molecule manipulation (Clausen-Schaumann et al, 2000; Rief et al, 1999) have 
again probed higher forces, and here the agreement is not so good. As shown in Fig. 5, the 
previously cited values for b and E do not give a successful extrapolation to the regime of 
higher forces. In the following section, we will propose a new model that borrows features 
from both the FJC and the WLC to describe these data more accurately. 
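
As a concrete reading of the extensible FJC described above, the sketch below multiplies the Langevin extension of Eq. 2 by the lowest-order factor $(1 + f/E)$. The function name, the value of $k_BT$, and the parameter values in the usage lines are illustrative assumptions, not the fit values from the cited experiments.

```python
import math

def efjc_extension(f_pN, b_nm, E_pN, kBT_pN_nm=4.11):
    """Extensible FJC: Langevin extension times (1 + f/E), lowest order in f/E.

    kBT_pN_nm = 4.11 is an assumed room-temperature value.
    """
    x = f_pN * b_nm / kBT_pN_nm
    if abs(x) < 1e-4:
        langevin = x / 3.0 - x**3 / 45.0  # series form near zero force
    else:
        langevin = 1.0 / math.tanh(x) - 1.0 / x
    return langevin * (1.0 + f_pN / E_pN)

if __name__ == "__main__":
    b, E = 1.5, 800.0  # nm and pN, illustrative values only
    for f in (1.0, 10.0, 100.0, 400.0):
        print(f, efjc_extension(f, b, E))
```

Because of the stretch factor, the relative extension exceeds 1 at high force, which is the behavior the inextensible FJC cannot reproduce.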



III. THE DISCRETE PERSISTENT CHAIN 



The previous sections have made it clear that a real polymer will display both discreteness 
and bend stiffness. While we have seen that the corresponding effects on the force-extension 
relation are interchangeable at very low forces, higher forces will distinguish them. Accord- 
ingly we now formulate a model with both b and A; later we will add a stretch stiffness as 
well. 

Our "Discrete Persistent Chain" (or DPC) models the polymer as a chain composed of $N$
segments of length $b$, whose conformation is once again fully described by the collection of
orientation vectors $\{\hat t_i\}$ (see Fig. 3). Bend resistance is taken into account by including an
energy penalty at each link proportional to the square of the angle $\Theta_{i,i+1} = \arccos(\hat t_i\cdot\hat t_{i+1})$
between two subsequent links. The energy functional describing this model is thus given by

$$\frac{\mathcal{E}[\{\hat t_i\}]}{k_BT} = -\frac{fb}{k_BT}\sum_{i=1}^{N}\hat t_i\cdot\hat z + \frac{A}{2b}\sum_{i=1}^{N-1}(\Theta_{i,i+1})^2. \tag{8}$$

The partition function for this energy functional is then given by

$$\mathcal{Z} = \left[\prod_{i=1}^{N}\int_{S^2} d^2\hat t_i\right]\left[\prod_{i=1}^{N-1} e^{-\mathcal{E}_i(\hat t_i,\hat t_{i+1})/k_BT}\right] e^{\frac{fb}{2k_BT}(\hat t_1+\hat t_N)\cdot\hat z}, \tag{9}$$

where

$$\frac{\mathcal{E}_i(\hat t_i,\hat t_{i+1})}{k_BT} = -\frac{fb}{2k_BT}(\hat t_i+\hat t_{i+1})\cdot\hat z + \frac{A}{2b}(\Theta_{i,i+1})^2 \tag{10}$$

and $S^2$ is the two-dimensional unit sphere.

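To compute $\mathcal{Z}$ we interpret each integral in Eq. 9 as a generalized matrix product (among
matrices with continuous indices), writing (Kramers and Wannier, 1941)

$$\mathcal{Z} = \vec v\cdot\mathsf{T}^{N-1}\cdot\vec w. \tag{11}$$

In this formula $\vec v$ and $\vec w$ are vectors indexed by $\hat t$, or in other words functions $v(\hat t)$, $w(\hat t)$.
The matrix product $\mathsf{T}\cdot\vec v$ is a new vector, defined by the convolution:

$$(\mathsf{T}\cdot\vec v)(\hat t_i) = \int_{S^2} d^2\hat t_j\,\mathsf{T}(\hat t_i,\hat t_j)\,v(\hat t_j). \tag{12}$$

The matrix elements of $\mathsf{T}$ are given by

$$\mathsf{T}(\hat t_i,\hat t_j) = e^{-\mathcal{E}_i(\hat t_i,\hat t_j)/k_BT}; \tag{13}$$

we will not need the explicit forms of $\vec v$ and $\vec w$ below.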

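The force-extension relation can be obtained from $\mathcal{Z}$ by differentiating with respect to
the force (see Eqs. 9-10):

$$\left\langle\frac{z}{L_{\rm tot}}\right\rangle = \frac{k_BT}{L_{\rm tot}}\frac{d}{df}\ln\mathcal{Z}. \tag{14}$$

It is here that the transfer matrix formulation can be used to greatly simplify the calculation
of the force-extension relation, since all that is needed to compute the logarithmic derivative
of $\mathcal{Z}$ in the limit of long chains is the largest eigenvalue of $\mathsf{T}$, which we will call $\lambda_{\max}$:

$$\left\langle\frac{z}{L_{\rm tot}}\right\rangle \xrightarrow{\ \text{large } N\ } \frac{k_BT}{L_{\rm tot}}\frac{d}{df}\ln(\lambda_{\max})^N = \frac{k_BT}{b}\frac{d}{df}\ln\lambda_{\max}. \tag{15}$$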

We will approximate $\lambda_{\max}$ using a variational scheme. Following the line of argument
of (Marko and Siggia, 1995), we note that the leading eigenfunction of $\mathsf{T}$ will reflect the
physics of the problem in the sense that it must be azimuthally symmetric and peaked in
the direction of the applied force. A suitable 1-parameter family of trial eigenfunctions $v_\omega$
can therefore be defined by

$$v_\omega(\hat t) = e^{\omega\,\hat t\cdot\hat z}. \tag{16}$$

Under (12), the $v_\omega$ have squared norms

$$\|v_\omega\|^2 = \frac{2\pi}{\omega}\sinh(2\omega), \tag{17}$$

which allows us to approximate $\lambda_{\max}$ variationally by

$$\lambda^*_{\max} = \max_\omega y(\omega) \equiv \max_\omega \frac{\vec v_\omega\cdot\mathsf{T}\cdot\vec v_\omega}{\|v_\omega\|^2}. \tag{18}$$

To get some idea of the quality of this variational approach, we can compare its results in 
the limit $b \to 0$ (the WLC) to the exact solution of that model. Fig. 4 plots the difference of
these force-extension curves, and shows that the results from the variational approximation 
are nowhere off by more than 1%. 
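
An independent check on any variational estimate is to diagonalize a discretized transfer matrix directly. The sketch below does this for the DPC kernel of Eq. 13 and is a swapped-in numerical technique, not the variational scheme used in the text: it averages the bend factor over the relative azimuth, discretizes $u = \hat t\cdot\hat z$ on a midpoint grid, and finds the largest eigenvalue by power iteration. The dimensionless inputs `f_tilde` $= fb/k_BT$ and `ell` $= A/b$, the grid sizes, and all function names are assumptions made for illustration.

```python
import math

def dpc_kernel(f_tilde, ell, n_u=48, n_phi=24):
    """Azimuthally integrated DPC transfer kernel on a midpoint grid in u = cos(theta).

    f_tilde = f*b/kBT and ell = A/b are dimensionless inputs; the grid sizes
    are illustrative choices, not tuned values.
    """
    du = 2.0 / n_u
    u = [-1.0 + (i + 0.5) * du for i in range(n_u)]
    s = [math.sqrt(1.0 - ui * ui) for ui in u]
    dphi = 2.0 * math.pi / n_phi
    cos_phi = [math.cos((k + 0.5) * dphi) for k in range(n_phi)]
    K = [[0.0] * n_u for _ in range(n_u)]
    for i in range(n_u):
        for j in range(i, n_u):
            bend = 0.0
            for c in cos_phi:
                cos_t = max(-1.0, min(1.0, u[i] * u[j] + s[i] * s[j] * c))
                theta = math.acos(cos_t)  # angle between the two links
                bend += math.exp(-0.5 * ell * theta * theta)
            val = bend * dphi * math.exp(0.5 * f_tilde * (u[i] + u[j])) * du
            K[i][j] = K[j][i] = val
    return K

def largest_eigenvalue(K, iters=100):
    """Largest eigenvalue of a symmetric positive matrix by power iteration."""
    n = len(K)
    v = [1.0] * n
    lam = 0.0
    for _ in range(iters):
        w = [sum(K[i][j] * v[j] for j in range(n)) for i in range(n)]
        lam = sum(w[i] * v[i] for i in range(n)) / sum(x * x for x in v)
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    return lam

def dpc_extension(f_tilde, ell, h=1e-3):
    """<z/L_tot> = d ln(lambda_max) / d f_tilde, by central difference (Eq. 15)."""
    lp = math.log(largest_eigenvalue(dpc_kernel(f_tilde + h, ell)))
    lm = math.log(largest_eigenvalue(dpc_kernel(f_tilde - h, ell)))
    return (lp - lm) / (2.0 * h)

if __name__ == "__main__":
    print(dpc_extension(2.0, 0.0), 1.0 / math.tanh(2.0) - 0.5)  # FJC limit
    print(dpc_extension(0.5, 2.0))  # stiffer chain, same reduced force
```

With `ell = 0` the kernel factorizes and the result reduces to the Langevin function of Eq. 2, up to the discretization error of the grid; a nonzero `ell` increases the extension at fixed reduced force, as expected for a stiffer chain.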

Returning to the full DPC model, Appendix A shows that it is possible to express $y(\omega)$
in terms of the dimensionless variables

$$\ell = \frac{A}{b}, \qquad \tilde f = \frac{fb}{k_BT} \tag{19}$$

as a combination of error functions as follows

[Eq. (20): closed-form expression for $y(\omega)$ in terms of $\tilde f$, $\ell$, $\omega$ and error functions; not legible in the source text, see Appendix A.]

This formula is only valid in the parameter regime where $\omega^*$ (the locus of the maximum of
Eq. 20) obeys

[Eq. (21): validity condition on $\omega^*$; not legible in the source text.]

For practical purposes this is the region where the magnitude of the bend stiffness $A$ is
larger than, or at most comparable to, the link length $b$, which is the physically relevant
regime. We maximize Eq. 20 numerically to obtain $\lambda^*_{\max}$, from which we can then compute
the force-extension relation by numerical differentiation with respect to the force. In the
small force limit, we can do a little better based on the observation that for small $f$, $\omega^*$
is also small. Expanding Eq. 20 to second order in $\omega$ and $f$ we can analytically solve the
stationarity condition $dy/d\omega = 0$ (which is now simply a quadratic equation) and determine the
small force entropic elastic behavior of our DPC model to be

$$\left\langle\frac{z}{L_{\rm tot}}\right\rangle \to \frac{f}{k_{\rm dpc}} + \mathcal{O}(f^2), \tag{22}$$

where the effective spring constant $k_{\rm dpc}$ for the DPC model is given by Eq. 23 (see footnote 1).

It is sometimes convenient to reexpress the parameters $A$ and $b$ of the DPC model in terms
of $k_{\rm dpc}$ and $b$. We do this using Eq. 23:

It is straightforward to add an intrinsic stretch modulus to the calculation outlined above, 
obtaining the "Extensible DPC" (or EDPC) model. We have computed the resulting force- 
extension curves, and fitted to recent data for ssDNA. The results of these fits are collected 
in Fig. 6. Fitting to the data points with $f < 400$ pN yields a value of the stretch modulus
of around $E \approx 4500$ pN, more than four times larger than even the largest of the previous
estimates (Clausen-Schaumann et al, 2000; Hegner et al, 1999; Rief et al, 1999). We
interpret this discrepancy by noting that if we hold $k$ constant while varying $b$, the difference
between the EFJC and EDPC models shows up in the high-force regime, which is also
sensitive to the choice of $E$. Thus neglecting cutoff effects causes curve fitting to make a
compensating change in $E$.

The best fit (in terms of $\chi^2$) is obtained for a value of $b \approx 0.17$ nm, away from both
the EWLC ($b = 0$) and EFJC ($b = 3k_BT/k = 1.7$ nm) limits of the model. Even though to
the eye the difference between the three models in the fit region might appear marginal,
the improvement in $\chi^2$ achieved by the DPC, at just over 18%, is statistically significant.
Interestingly, the fit value of $b$ is indeed comparable to the physical segment length of ssDNA
(0.6 nm), a result not put in by hand. Fig. 6 also shows that our EDPC model extrapolates
better to the high-force regime than either the EFJC or the EWLC.

Previous authors have already noted that the extensible FJC model does not accurately 
model the high-force data (Clausen-Schaumann et al, 2000; Rief et al, 1999), but have 
attributed its failure to the onset of nonlinear elasticity effects. We may expect such effects 



1 Eq. 23 has the expected property that $k_{\rm dpc} \to k_{\rm wlc}$ when we send $b \to 0$ with $A$ fixed. The opposite
limit, where $A$ goes to zero holding $b$ fixed, should recover the FJC, but instead Eq. 23 gives an unphysical,
negative value of $k_{\rm dpc}$. However, this limit takes us outside the domain of validity Eq. 21, and we cannot
use Eq. 23 any more. We have verified numerically that the DPC model does reduce to the FJC in that
particular limit.
to become significant when the ratio $f/E$ exceeds, say, 10%. Our large fit value of $E$ means
that we ought to be able to trust our linear-elasticity model out to around $f = 400$ pN,
which is why we used only the data up to this point in our fit. Indeed Fig. 6 shows that
the extensible DPC model works well out to $f = 400$ pN. Carrying the fit out to still larger
values of $f$ would raise the fit value of $E$ still further.

IV. THE OVERSTRETCHING TRANSITION 
A. Background 

As first observed by Cluzel et al. (Cluzel et al, 1996) and Smith et al. (Smith et al,
1996), stretching double-stranded DNA is quite different from stretching single-stranded DNA. Their
experiments showed that at a force of around 65-70 pN the DNA sample suddenly snaps 
open (an "overstretching transition"), extending to almost twice its original contour length 
before entering a second entropic stretching regime. This second regime clearly represents a 
"stretched" DNA configuration quite different from ordinary double stranded or B-DNA, and 
has been dubbed S-DNA. The transition from B-DNA into S-DNA is very sharp, indicating 
a high level of cooperativity. 

S-DNA appears to have a definite helical pitch (Leger, 1999; Leger et al, 1999), consis- 
tent with its being a new, double-stranded conformation. An alternative view interprets 
the overstretching transition as force-induced melting (denaturation) of the B-DNA duplex 
(Rouzina and Bloomfield, 2001a,b). One implication of the latter view is that S-DNA should 
have elastic properties similar to those of two single strands, a point to which we will return 
later. 

Whatever view we take of its structural character, the sharpness of the overstretching 
transition is reminiscent of another well-studied structural transition in biopolymers, the 
helix-coil transition (Zimm and Bragg, 1959). Inspired by the classic analysis of Zimm and 
Bragg, this section will model the B→S transition by a two-state (Ising) model living on a
DPC (the "Ising-DPC model"). We will make no assumptions about the nature of either 
B- or S-DNA. Both are allowed to have arbitrary bend and stretch stiffnesses. Our aim is to 
fit the resulting force-extension curves to the available data and to see whether the values 
of the elastic constants can help characterize the stretched state. (The other state is just 
double stranded DNA, whose elastic constants are well known.)



B. General Setup 

Fig. 7 illustrates the model that we will be considering in some more detail. We envision 
a chain consisting of N links, connected by hinges that try to align the segments they join. 
Each segment carries a discrete variable $\sigma$, which takes the values ±1. We will take $\sigma = +1$
to mean the segment is in the B-state and $\sigma = -1$ for the S-state. The factor by which a
segment elongates when going from B to S will be called $\zeta$, i.e. $b_S = \zeta b$ (with $\zeta > 1$). We
assign a bend stiffness parameter $A$ to B-DNA, and a different $A_S = \beta\zeta A$ to S-DNA; $\beta$ is
a dimensionless parameter with $\beta\zeta < 1$. We also need to assign a bend stiffness to a hinge
joining a B and an S segment. This value we will call $\eta A$.

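We can now write down the full energy functional for our Ising-DPC model:

$$\begin{aligned}
\frac{\mathcal{E}[\{\hat t_i,\sigma_i\}]}{k_BT} = -\sum_{i=1}^{N-1}\Bigg\{ & \frac{\alpha_0}{2}(\sigma_i+\sigma_{i+1}) + \gamma(\sigma_i\sigma_{i+1}-1) \\
& + \frac{fb}{2k_BT}\left[\left(\frac{1+\sigma_i}{2}+\frac{1-\sigma_i}{2}\zeta\right)\hat t_i\cdot\hat z + \left(\frac{1+\sigma_{i+1}}{2}+\frac{1-\sigma_{i+1}}{2}\zeta\right)\hat t_{i+1}\cdot\hat z\right] \\
& - \frac{A}{2b}\left[\frac{(1-\sigma_i)(1-\sigma_{i+1})}{4}\beta + \frac{|\sigma_i-\sigma_{i+1}|}{2}\eta + \frac{(1+\sigma_i)(1+\sigma_{i+1})}{4}\right](\Theta_{i,i+1})^2\Bigg\}.
\end{aligned} \tag{25}$$
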

The first line is the pure-Ising part, with $2\alpha_0 k_BT$ the intrinsic free energy cost of converting
a single segment from B to S and $2\gamma k_BT$ the energy cost of creating a B→S interface. Note
that we ignore a contribution to the energy functional from the first and last segments. In
the long-chain limit this does not affect the outcome of our calculation.
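
To see what the pure-Ising part alone implies, one can drop the orientation vectors and keep only the weights $\exp[\tfrac{\alpha_0}{2}(\sigma_i+\sigma_{i+1}) + \gamma(\sigma_i\sigma_{i+1}-1)]$. This is a minimal sketch of that reduced problem, not the full Ising-DPC model: it ignores the force and bending terms, and the function names are assumptions introduced here. The 2×2 transfer matrix then has entries $e^{\alpha_0}$, $e^{-\alpha_0}$ on the diagonal and $e^{-2\gamma}$ off the diagonal.

```python
import math

def ising_largest_eigenvalue(alpha0, gamma):
    """Largest eigenvalue of [[e^a, e^(-2g)], [e^(-2g), e^(-a)]]."""
    return math.cosh(alpha0) + math.sqrt(math.sinh(alpha0) ** 2 + math.exp(-4.0 * gamma))

def fraction_B(alpha0, gamma):
    """Long-chain fraction of segments in the B state (sigma = +1).

    Uses <sigma> = d ln(lambda_max) / d alpha0, evaluated in closed form.
    """
    mean_sigma = math.sinh(alpha0) / math.sqrt(
        math.sinh(alpha0) ** 2 + math.exp(-4.0 * gamma))
    return 0.5 * (1.0 + mean_sigma)

if __name__ == "__main__":
    for gamma in (0.0, 1.0, 2.0):
        print(gamma, [round(fraction_B(a, gamma), 3) for a in (-0.2, -0.1, 0.0, 0.1, 0.2)])
```

Increasing $\gamma$ (a larger interface cost) makes the B fraction switch more abruptly as $\alpha_0$ passes through zero, which is the cooperativity that the full model needs in order to reproduce a sharp transition.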

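The partition function for the energy functional (25) is

$$\mathcal{Z} = \left[\prod_{i=1}^{N-1}\sum_{\sigma_i=\pm1}\int_{S^2} d^2\hat t_i\right]\prod_{i=1}^{N-1} e^{-\mathcal{E}_i(\hat t_i,\sigma_i,\hat t_{i+1},\sigma_{i+1})/k_BT}. \tag{26}$$

We will again calculate $\mathcal{Z}$ with the aid of the transfer matrix technique (Kramers and
Wannier, 1941), writing Eq. 26 as

$$\mathcal{Z} = \vec v\cdot\mathsf{T}^{N-1}\cdot\vec w, \tag{27}$$
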
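with $\mathsf{T}$ now the transfer matrix for our Ising-DPC model, which carries an additional 2-by-2
structure due to the Ising variables. The dot products are thus defined as

$$(\mathsf{T}\cdot\vec v)_{\sigma_i}(\hat t_i) = \sum_{\sigma_j=\pm1}\int_{S^2} d^2\hat t_j\,\mathsf{T}_{\sigma_i\sigma_j}(\hat t_i,\hat t_j)\,v_{\sigma_j}(\hat t_j).$$

The individual matrix elements $\mathsf{T}_{\sigma_i\sigma_j}$ are given explicitly by (writing $\tfrac12\Theta^2 \approx 1 - \hat t_i\cdot\hat t_{i+1}$)

$$\begin{aligned}
\mathsf{T}_{1,1}(\hat t_i,\hat t_{i+1}) &= \exp\left[\tfrac12\tilde f\,(\hat t_i+\hat t_{i+1})\cdot\hat z - \frac{A}{b}(1-\hat t_i\cdot\hat t_{i+1}) + \alpha_0\right],\\
\mathsf{T}_{1,-1}(\hat t_i,\hat t_{i+1}) &= \exp\left[\tfrac12\tilde f\,(\hat t_i+\zeta\hat t_{i+1})\cdot\hat z - \frac{\eta A}{b}(1-\hat t_i\cdot\hat t_{i+1}) - 2\gamma\right],\\
\mathsf{T}_{-1,1}(\hat t_i,\hat t_{i+1}) &= \exp\left[\tfrac12\tilde f\,(\zeta\hat t_i+\hat t_{i+1})\cdot\hat z - \frac{\eta A}{b}(1-\hat t_i\cdot\hat t_{i+1}) - 2\gamma\right],\\
\mathsf{T}_{-1,-1}(\hat t_i,\hat t_{i+1}) &= \exp\left[\tfrac12\zeta\tilde f\,(\hat t_i+\hat t_{i+1})\cdot\hat z - \frac{\beta A}{b}(1-\hat t_i\cdot\hat t_{i+1}) - \alpha_0\right],
\end{aligned} \tag{28}$$

where again $\tilde f \equiv fb/k_BT$.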

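Once again we approximate the largest eigenvalue of the transfer matrix $\mathsf{T}$ using a
variational approach, choosing our trial eigenfunctions to possess azimuthal symmetry and to
be peaked in the direction of the force $\hat z$. This time, however, we need a three-parameter
family of trial functions:

$$\vec v_{\omega_1,\omega_{-1},\varphi}(\hat t) = \begin{pmatrix}\left(\dfrac{\omega_1}{\sinh(2\omega_1)}\right)^{1/2} e^{\omega_1\hat t\cdot\hat z}\cos\varphi\\[2ex] \left(\dfrac{\omega_{-1}}{\sinh(2\omega_{-1})}\right)^{1/2} e^{\omega_{-1}\hat t\cdot\hat z}\sin\varphi\end{pmatrix}, \tag{29}$$

chosen such that their squared norm is independent of all parameters:

$$\|\vec v_{\omega_1,\omega_{-1},\varphi}\|^2 = 2\pi. \tag{30}$$

Eq. 29 shows that once again the $\omega$'s give the degree of alignment of the monomers (how
forward-peaked their probability distribution is), whereas $\varphi$ describes the relative probability
of a monomer to be in the two states. The variational estimate for the maximal eigenvalue
is now given by

$$\lambda^*_{\max} = \max_{\omega_1,\omega_{-1},\varphi} y(\omega_1,\omega_{-1},\varphi) \equiv \max_{\omega_1,\omega_{-1},\varphi}\frac{\vec v_{\omega_1,\omega_{-1},\varphi}\cdot\mathsf{T}\cdot\vec v_{\omega_1,\omega_{-1},\varphi}}{\|\vec v_{\omega_1,\omega_{-1},\varphi}\|^2}. \tag{31}$$

The maximization over $\varphi$ can be done analytically: defining the 2×2 matrix $\tilde{\mathsf{T}}(\omega_1,\omega_{-1})$
by

$$\vec v_{\omega_1,\omega_{-1},\varphi}\cdot\mathsf{T}\cdot\vec v_{\omega_1,\omega_{-1},\varphi} = (\cos\varphi,\ \sin\varphi)\cdot\tilde{\mathsf{T}}(\omega_1,\omega_{-1})\cdot\begin{pmatrix}\cos\varphi\\ \sin\varphi\end{pmatrix}, \tag{32}$$
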
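or equivalently specifying its entries

$$\tilde{\mathsf{T}}_{\sigma\sigma'}(\omega_\sigma,\omega_{\sigma'}) = \left(\frac{\omega_\sigma\,\omega_{\sigma'}}{\sinh(2\omega_\sigma)\sinh(2\omega_{\sigma'})}\right)^{1/2}\int_{S^2} d^2\hat t_i\int_{S^2} d^2\hat t_j\; e^{\omega_\sigma\hat t_i\cdot\hat z}\,\mathsf{T}_{\sigma\sigma'}(\hat t_i,\hat t_j)\,e^{\omega_{\sigma'}\hat t_j\cdot\hat z}, \tag{33}$$

it is easy to show that

$$\lambda^*_{\max} = \max_{\omega_1,\omega_{-1}}\frac{\tilde y(\omega_1,\omega_{-1})}{\|\vec v_{\omega_1,\omega_{-1},\varphi}\|^2}, \tag{34}$$

where $\tilde y(\omega_1,\omega_{-1})$ is the maximal eigenvalue of $\tilde{\mathsf{T}}(\omega_1,\omega_{-1})$. The following subsection will
calculate this eigenvalue in a continuum approximation to $\tilde{\mathsf{T}}(\omega_1,\omega_{-1})$, illustrating the
procedure by considering in some detail the matrix element $\tilde{\mathsf{T}}_{1,1}(\omega_1)$. The other matrix
elements can be obtained analogously. Writing out the integrals explicitly, we have

$$\tilde{\mathsf{T}}_{1,1}(\omega_1) = \frac{\omega_1\,e^{\alpha_0}}{\sinh(2\omega_1)}\int_{S^2} d^2\hat t_i\int_{S^2} d^2\hat t_j\,\exp\left[a\,(\hat t_i+\hat t_j)\cdot\hat z - \frac{A}{b}(1-\hat t_i\cdot\hat t_j)\right], \tag{35}$$

where we have introduced $a = \omega_1 + \tilde f/2$. Condensing notation even further we define
$\mu^2 = a^2 + (A/b)^2 + 2a(A/b)\,\hat t_i\cdot\hat z$, which allows us to write
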

C. Continuum Limit 

We could now proceed to evaluate the force-extension relation of the Ising-DPC model, by 
generalizing Sect. III. To simplify the calculations, however, we will first pass to a continuum 
limit. To justify this step, note that Fig. 6 shows that the continuum (WLC) approximation 
gives an excellent account of single-stranded DNA stretching out to forces beyond those 
probed in overstretching experiments (about 90 pN). As mentioned earlier, the continuum 
approximation is also quite good for double-stranded DNA, because the latter's persistence 
length is so much longer than its monomer size. 

In the continuum limit $b$ is sent to zero holding $L_{\rm tot}$ fixed; hence $N \to \infty$. The
bookkeeping is more manageable after a shift in $\mu$:

$$x = \mu - \frac{A}{b}. \tag{37}$$
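Eq. 36 then reduces to

$$\begin{aligned}
\tilde{\mathsf{T}}_{1,1}(\omega_1) &= \frac{e^{\alpha_0}\,\omega_1}{\sinh(2\omega_1)}\,\frac{(2\pi)^2 b}{aA}\int_{-a}^{a} dx\,\exp\left[\frac{b}{2A}x^2 + 2x - \frac{a^2 b}{2A}\right]\\
&\approx \frac{e^{\alpha_0}\,\omega_1}{\sinh(2\omega_1)}\,\frac{(2\pi)^2 b}{aA}\int_{-a}^{a} dx\; e^{2x}\left(1 + \frac{x^2 b}{2A}\right)e^{-a^2 b/2A}.
\end{aligned} \tag{38}$$

The last integral can be worked out exactly, and expanding the result to second order in $b$
we end up with

$$\frac{A}{2\pi b\,\|\vec v_{\omega_1,\omega_{-1},\varphi}\|^2}\,\tilde{\mathsf{T}}_{1,1}(\omega_1) = e^{\alpha_0}\left[1 + b\left(\frac{f}{k_BT} - \frac{\omega_1}{2A}\right)\left(\coth(2\omega_1) - \frac{1}{2\omega_1}\right)\right]. \tag{39}$$

In similar fashion, we can obtain the following expressions for the other matrix elements:

$$\begin{aligned}
\frac{A}{2\pi b\,\|\vec v_{\omega_1,\omega_{-1},\varphi}\|^2}\,\tilde{\mathsf{T}}_{-1,-1}(\omega_{-1}) &= \frac{e^{-\alpha_0}}{\beta}\left[1 + b\left(\frac{\zeta f}{k_BT} - \frac{\omega_{-1}}{2\beta A}\right)\left(\coth(2\omega_{-1}) - \frac{1}{2\omega_{-1}}\right)\right],\\
\frac{A}{2\pi b\,\|\vec v_{\omega_1,\omega_{-1},\varphi}\|^2}\,\tilde{\mathsf{T}}_{1,-1}(\omega_1,\omega_{-1}) &= \frac{e^{-2\gamma}}{\eta}\left(\frac{\omega_1\,\omega_{-1}}{\sinh(2\omega_1)\sinh(2\omega_{-1})}\right)^{1/2}\frac{2\sinh(\omega_1+\omega_{-1})}{\omega_1+\omega_{-1}}.
\end{aligned} \tag{40}$$
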

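To obtain a nontrivial continuum limit we must now specify how the parameters $A$, $\alpha_0$,
and $\gamma$ depend on $b$ as $b \to 0$. It is straightforward to show that the choices

$$\alpha_0 = -\tfrac12\ln\beta + b\,\bar\alpha, \qquad \gamma = -\tfrac12\ln(\bar g\,b) \tag{41}$$

work, where we hold $A$, $\bar\alpha$, $\beta$ and $\bar g$ fixed as $b \to 0$. With these choices, the matrix
$\tilde{\mathsf{T}}(\omega_1,\omega_{-1})$ takes the form

$$\frac{A}{2\pi b\,\|\vec v_{\omega_1,\omega_{-1},\varphi}\|^2}\,\tilde{\mathsf{T}}(\omega_1,\omega_{-1}) = \frac{1}{\sqrt\beta}\begin{pmatrix}1 + b\,\mathcal{P} & b\,\mathcal{Q}\\ b\,\mathcal{Q} & 1 + b\,\mathcal{R}\end{pmatrix} \tag{42}$$

with

$$\begin{aligned}
\mathcal{P} &= \bar\alpha + \left(\frac{f}{k_BT} - \frac{\omega_1}{2A}\right)\left(\coth(2\omega_1) - \frac{1}{2\omega_1}\right),\\
\mathcal{R} &= -\bar\alpha + \left(\frac{\zeta f}{k_BT} - \frac{\omega_{-1}}{2A\beta}\right)\left(\coth(2\omega_{-1}) - \frac{1}{2\omega_{-1}}\right),\\
\mathcal{Q} &= \frac{\bar g\sqrt\beta}{\eta}\left(\frac{\omega_1\,\omega_{-1}}{\sinh(2\omega_1)\sinh(2\omega_{-1})}\right)^{1/2}\left(\frac{2\sinh(\omega_1+\omega_{-1})}{\omega_1+\omega_{-1}}\right).
\end{aligned} \tag{43}$$
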
Note that the prefactor in Eq. 42 does not contribute to the force-extension result
Eq. 15, since it does not depend on the force. In terms of the individual matrix entries, the
quantity to be maximized now reads (see Eq. 31):

$$\ln\tilde y(\omega_1,\omega_{-1}) = \frac{b}{2}\left(\mathcal{P} + \mathcal{R} + \sqrt{(\mathcal{P}-\mathcal{R})^2 + 4\mathcal{Q}^2}\right). \tag{44}$$

Writing $\Omega = b^{-1}\ln\lambda^*_{\max} = b^{-1}\times\max_{\omega_1,\omega_{-1}}\ln\tilde y(\omega_1,\omega_{-1})$, the force-extension in the continuum
limit is finally given by

$$\left\langle\frac{z}{L_{\rm tot}}\right\rangle = k_BT\,\frac{d\Omega}{df}. \tag{45}$$

We evaluate $\Omega$ by numerically maximizing Eq. 44.

So far, we have not included stretch moduli for the B- and S-DNA. This is easily
implemented to first order in $f/E$ by replacing $f$ with $f\left(1 + \frac{f}{2E_{S,B}}\right)$ in the matrix elements for the
two states respectively (Eq. 29). This procedure yields theoretical force-extension curves
like the ones plotted in Figs. (8) and (9).
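
The numerical maximization behind Eq. 45 can be sketched as follows. This is an illustration under stated assumptions, not the routine used for the fits: the expressions for the diagonal entries P, R and the off-diagonal entry Q are the continuum-limit forms as read here (with the cooperativity and hinge stiffness lumped into one hypothetical parameter `g_eff`), stretch moduli are omitted, the optimizer is a simple coordinate-wise golden-section search, and every parameter value is illustrative rather than fitted.

```python
import math

KBT = 4.11  # pN*nm, assumed room-temperature value

def _c(w):
    """coth(2w) - 1/(2w): mean alignment for trial-function parameter w."""
    return 1.0 / math.tanh(2.0 * w) - 1.0 / (2.0 * w)

def _log_sinh(x):
    """log(sinh(x)) for x > 0, written to avoid overflow at large x."""
    return x + math.log1p(-math.exp(-2.0 * x)) - math.log(2.0)

def ln_y_over_b(f, w1, wm1, p):
    """Right-hand side of Eq. 44 divided by b (units 1/nm); forms are assumed."""
    P = p["alpha"] + (f / KBT - w1 / (2.0 * p["A"])) * _c(w1)
    R = -p["alpha"] + (p["zeta"] * f / KBT - wm1 / (2.0 * p["beta"] * p["A"])) * _c(wm1)
    overlap = (2.0 * math.sqrt(w1 * wm1) / (w1 + wm1)) * math.exp(
        _log_sinh(w1 + wm1) - 0.5 * _log_sinh(2.0 * w1) - 0.5 * _log_sinh(2.0 * wm1))
    Q = p["g_eff"] * overlap
    return 0.5 * (P + R + math.sqrt((P - R) ** 2 + 4.0 * Q * Q))

def _golden_max(func, lo, hi, iters=80):
    """Golden-section search for the maximizer of a unimodal function on [lo, hi]."""
    r = (math.sqrt(5.0) - 1.0) / 2.0
    x1, x2 = hi - r * (hi - lo), lo + r * (hi - lo)
    f1, f2 = func(x1), func(x2)
    for _ in range(iters):
        if f1 < f2:
            lo, x1, f1 = x1, x2, f2
            x2 = lo + r * (hi - lo)
            f2 = func(x2)
        else:
            hi, x2, f2 = x2, x1, f1
            x1 = hi - r * (hi - lo)
            f1 = func(x1)
    return 0.5 * (lo + hi)

def omega(f, p, sweeps=30, w_lo=1e-3, w_hi=200.0):
    """Omega(f): maximum of Eq. 44 over both trial parameters (coordinate ascent)."""
    w1 = wm1 = 1.0
    for _ in range(sweeps):
        w1 = _golden_max(lambda w: ln_y_over_b(f, w, wm1, p), w_lo, w_hi)
        wm1 = _golden_max(lambda w: ln_y_over_b(f, w1, w, p), w_lo, w_hi)
    return ln_y_over_b(f, w1, wm1, p)

def extension(f, p, h=0.05):
    """<z/L_tot> = kBT dOmega/df (Eq. 45), by central difference in f (pN)."""
    return KBT * (omega(f + h, p) - omega(f - h, p)) / (2.0 * h)

if __name__ == "__main__":
    params = {"A": 50.0, "beta": 0.12, "zeta": 1.7, "alpha": 5.5, "g_eff": 0.01}
    for f in (10.0, 40.0, 60.0, 80.0, 100.0):
        print(f, round(extension(f, params), 3))
```

With a very large `alpha` the S state is suppressed and the result reduces to the one-state variational WLC, whose high-force extension approaches $1 - \sqrt{k_BT/4fA}$; with the illustrative parameters above, the extension rises from just under 1 to roughly $\zeta$ across the transition.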

In summary, our model contains the following seven parameters. $2\bar\alpha k_BT$ is the free
energy per unit length required to flip B-DNA into the S-state, and is measured in [J/nm].
$\bar g$ measures the cooperativity of the transition and has units [1/nm]. $A$ is the bend stiffness
parameter of B-DNA, with units [nm]. The dimensionless parameter $\beta$ is the ratio of the B-
and S-DNA bend stiffnesses. $E_B$ and $E_S$ are the stretch stiffnesses of B and S-DNA, and
are measured in pN. Finally, $\zeta$ is the dimensionless elongation factor associated with the
B→S transition.



D. Discussion of fits 



Our strategy is now as follows: first, we fit the part of the stretching curve well below 
65 pN to a one-state, continuum model (i.e. to the EWLC), determining its effective spring 
constant and stretch modulus. The values thus obtained are used as initial guesses in a fit 
of the full curve to the Ising-DPC model. To improve convergence, we eliminate two of the 
parameters, as follows. First, we can get an accurate value for E B from the low force data, 
so we hold it fixed to this value during the full fit. Second, as described in Sect. III we can
work out the low-force limit analytically, and from this obtain the effective spring constant
$k$ as a function of the model's parameters. We invert this relation to get $A$ as a function of
$k$ and the other parameters. We substitute this $A$, holding $k$ fixed to the value obtained by
fitting the low-force data to an EWLC. We then fit the remaining five parameters ($\beta$, $\bar g$, $\bar\alpha$,
$E_S$ and $\zeta$) to the dataset (see footnote 2).

2 In our fits, we exclude the data points in the steepest region of the graph. Because of the inevitable 
scatter in the data and the fact that only the deviations in the y-direction enter into $\chi^2$ their residuals
The results of the fits obtained in this manner are collected in Figs. (8) and (9). Our Ising-
DPC hybrid model fits the experimental data rather well, but with so many fit parameters
one may ask whether the model actually makes any falsifiable predictions. To answer this
question we note that the data below the transition suffice to fix $A$ and $E_B$ as usual, roughly
speaking from the curvature and slope of the curve below the transition. Similarly, the data
above the transition fix $A_S = \beta\zeta A$ and $E_S$. The vertical jump in the curve at the transition
fixes $\zeta$. The horizontal location of the jump fixes $\bar\alpha$, and the steepness of the jump fixes the
cooperativity $\bar g$ (see footnote 3). Thus all of the model's parameters are fixed by specific features of the
data. Two additional, independent features of the data now remain, namely the rounding
of the curve at the start and end of the transition. Our model predicts these features fairly
successfully.

Some common features emerging from the two fits deserve comment. First, both fits 
reproduce the known values for the effective persistence length of B-DNA of around 50 nm 
and its stretch modulus of about 1000 pN. Second, we can read off the bend stiffness of 
S-DNA from our fit as $A_S = \beta\zeta A = 12.32$ nm (data from Fig. 8) or 7.2 nm (data from
Fig. 9). If S-DNA consisted of two unbound, single strands, we might have expected $A_S$ to
be twice as large as the value $A \approx 0.85$ nm obtained by fitting the single-strand stretching
data with the continuum EDPC model (see Fig. 6). On the contrary, we find that the bend 
stiffness of S-DNA is intermediate between that of B-DNA and that of two single strands. 
This conclusion fits qualitatively with some of the structural models of S-DNA, in which the 
bases remain paired but are not stacked as in B-DNA. 

Our third conclusion is that the stretch modulus of S-DNA is substantially higher than 
that of B-DNA. This conclusion is again consistent with the view of S-DNA as stabilized 
mainly by its backbones, which are much straighter than in B-DNA; the contour length of 
B-DNA is instead determined by weaker, base-stacking interactions. 

are overemphasized, hindering convergence and accuracy of the routine. 
3 The fit value of $\bar\alpha$ should be regarded as an average of the two different costs to convert AT or GC pairs.
The fit value of $\bar g$ has no direct microscopic significance, as the apparent cooperativity of the transition
will be reduced by the sequence disorder. 
E. Relation to prior work

Polymer models with both finite cutoff and steric hindrances to motion are not new.
Classical examples include the rotation-isomer models, in which succeeding monomers are
joined by bonds of fixed polar angle but variable azimuthal angle (Grosberg and Khokhlov,
1994). Models of this sort have had some success in making a priori predictions of the
persistence length of a polymer from its structural information, but obtaining the
force-extension relation is mathematically very difficult. Thus, for example, Miyake and
Sakakibara (1962) obtain only the first subleading term in the low-force expansion. We are not
aware of a prior formulation of a model incorporating the microscopic physics of discreteness
and stiffness, with a detailed experimental test.

Several authors have also studied the entropic elasticity of two-state chains. As soon as
the overstretching transition was discovered, Cluzel proposed a pure Ising model by analogy
to the helix-coil transition (Cluzel, 1996). Others then introduced entropic elasticity, but
required that both states have the same bending stiffness as B-DNA (Ahsan et al., 1998;
Marko, 1998), or took one of the two states to be infinitely stiff (Tamashiro and Pincus,
2001), or to be an FJC (Rouzina and Bloomfield, 2001a,b). We believe our Ising-DPC model
to be the first consistent formulation incorporating the coexistence of two different states
with arbitrary elastic constants. Our approach is also computationally more straightforward
than some, and minimal in the sense that no unknown potential function needs to be chosen.

V. STATISTICAL ANALYSIS OF THE B→S TRANSITION

Using standard techniques from statistical physics, we now look at the B→S transition
in some more detail. From the expressions for the Ising-DPC hybrid energy functional (25)
and the partition function (26) we read off that the average "spin" $\langle\sigma\rangle$ can be obtained as

\[ \langle\sigma\rangle = \frac{\partial\Omega}{\partial\alpha}, \tag{46} \]

so that for instance the relative population of the S-state (or equivalently the probability to
find an arbitrary segment in the S-state), $P(S)$, is given by

\[ P(S) = \tfrac{1}{2}\left(1 - \langle\sigma\rangle\right). \tag{47} \]
Similarly, we can take the derivative of Eq. 26 with respect to $\gamma$ to determine the average
nearest neighbor spin correlator

\[ \langle\sigma_i\sigma_{i+1}\rangle = \frac{1}{N}\frac{\partial\ln Z}{\partial\gamma} + 1 = 1 - 2bQ\,\frac{\partial\Omega}{\partial Q}. \tag{48} \]

The quantity $\langle\sigma_i\sigma_{i+1}\rangle$ can be interpreted as the fraction of nearest neighbor pairs in the
same state minus the fraction of pairs in opposite states. Consequently, the probability of
having a spin flip at a given site is $P(\mathrm{flip}) = \tfrac{1}{2}(1 - \langle\sigma_i\sigma_{i+1}\rangle)$ and the average number of S+B
domain pairs is $N_{\rm pairs} = \tfrac{N}{2}P(\mathrm{flip})$. A heuristic measure of the typical S-domain size is then
(Cantor and Schimmel, 1980)

\[ L_{\rm dom} = \frac{N b\,P(S)}{N_{\rm pairs}} = \frac{4b\,P(S)}{1 - \langle\sigma_i\sigma_{i+1}\rangle} = \left(1 - \frac{\partial\Omega}{\partial\alpha}\right) \Big/ \left(Q\,\frac{\partial\Omega}{\partial Q}\right). \tag{49} \]
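The bookkeeping behind this domain-size estimate can be sketched in a few lines. The helper below takes the average spin and the nearest-neighbor correlator as given inputs (it does not evaluate the free energy of the Ising-DPC model) and returns the S-state fraction, the flip probability, the number of S+B domain pairs and the resulting domain length. The function name, the input values in the usage example, and the choice to measure the domain in units of the segment length $b$ are assumptions made for illustration.

```python
# Heuristic S-domain statistics from the average spin and the
# nearest-neighbor correlator. Inputs are treated as given numbers;
# the function name and example values are illustrative assumptions.

def s_domain_statistics(mean_spin, nn_correlator, n_segments, b_nm):
    """Return P(S), P(flip), N_pairs and L_dom (nm) for a two-state chain.

    mean_spin      -- <sigma>, with sigma = +1 for B and -1 for S
    nn_correlator  -- <sigma_i sigma_{i+1}>
    n_segments     -- number of segments N
    b_nm           -- segment length b in nm (assumed length unit)
    """
    p_s = 0.5 * (1.0 - mean_spin)            # fraction of segments in the S-state
    p_flip = 0.5 * (1.0 - nn_correlator)     # probability of a B/S boundary at a site
    n_pairs = 0.5 * n_segments * p_flip      # each S+B domain pair has two boundaries
    if n_pairs <= 0.0:
        raise ValueError("no domain boundaries: domain size is undefined")
    l_dom_nm = n_segments * b_nm * p_s / n_pairs
    return {"p_s": p_s, "p_flip": p_flip, "n_pairs": n_pairs, "l_dom_nm": l_dom_nm}

# Arbitrary illustrative inputs: half the chain in the S-state, 5% boundaries.
example = s_domain_statistics(mean_spin=0.0, nn_correlator=0.9,
                              n_segments=1000, b_nm=0.34)

if __name__ == "__main__":
    print(example)
```

With these inputs there are 25 domain pairs sharing 500 S-segments, so a typical S-domain is 20 segments long.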

We wish to highlight two points from this discussion. First, Fig. 10 shows the fraction
in the S-state, $P(S)$, as a function of the applied force, and we can see the characteristic
sigmoidal behavior as the system is led through the transition. As the inset demonstrates,
a small fraction is in the S-state even at zero force. This fraction initially decreases upon
increasing the stretching force.⁴ Fig. 11 plots the typical S-domain length $L_{\rm dom}$ versus
applied stretching force. It demonstrates how even well above the transition the S-state on
average does not persist for very long; at the high end of the physically accessible range of
forces S-domains measure about 160 nm. This figure has some significance as it illustrates
an important point about the role of nicks in the experiments. Empirically, when working
with λ-phage DNA only around 5% of all samples are completely unnicked (Leger, 1999).
Since the λ-phage genome is about 48 kbp in length, we can roughly estimate the probability
for an arbitrary base pair to be unnicked as $P(\mathrm{not}) = (0.05)^{1/48000}$, and consequently the
probability that a given pair is nicked is $P(\mathrm{nick}) = 1 - P(\mathrm{not}) \approx 6.2\times10^{-5}$. Given the total
length of λ-phage DNA, this implies we expect there to be an average of
$6.2\times10^{-5}\times 48\times10^{3} \approx 3$ nicks per sample, corresponding to an average distance
between nicks of the order of 5 μm, considerably larger than the typical S-domain size. This
observation bears on the question of the character of the S state of DNA (Rouzina and
Bloomfield, 2001a): even if S-DNA were a denatured state, the existence of nicks would not
necessarily cause it to suffer irreversible changes in its elasticity as tracts spanning two
nicks fall off during overstretching.
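The nick estimate in the preceding paragraph is easy to reproduce. The sketch below follows the same steps: per-base-pair survival probability from the 5% unnicked fraction, expected number of nicks, and mean spacing between nicks. The rise of 0.34 nm per base pair used to convert base pairs to contour length is the standard B-DNA value and is an assumption here, as is the function name.

```python
# Reproduce the nick-density estimate: if 5% of 48 kbp molecules carry no
# nick at all, what is the per-base-pair nick probability and the mean
# distance between nicks?
# Assumption: 0.34 nm rise per base pair (standard B-DNA value).

def nick_statistics(unnicked_fraction=0.05, n_bp=48000, rise_nm=0.34):
    """Return (P(nick) per base pair, mean nicks per molecule, spacing in um)."""
    p_not = unnicked_fraction ** (1.0 / n_bp)   # P(not): a given pair is intact
    p_nick = 1.0 - p_not                        # P(nick)
    mean_nicks = p_nick * n_bp                  # expected nicks per molecule
    contour_nm = n_bp * rise_nm                 # total contour length
    spacing_um = contour_nm / mean_nicks / 1000.0
    return p_nick, mean_nicks, spacing_um

p_nick, mean_nicks, spacing_um = nick_statistics()

if __name__ == "__main__":
    print(f"P(nick) = {p_nick:.2e} per base pair")
    print(f"mean nicks per molecule = {mean_nicks:.2f}")
    print(f"mean distance between nicks = {spacing_um:.1f} um")
```

The spacing of several micrometers is to be compared with the S-domain size of about 160 nm read off from Fig. 11.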



⁴ A related reentrant phenomenon was noted in (Tamashiro and Pincus, 2001).
Secondly, different groups have not agreed on whether the stretching curves of
double-stranded and single-stranded DNA coincide at forces above the former's overstretching
transition (Bustamante et al., 2000; Leger, 1999). We wish to point out that even if S-DNA were
a denatured state, we still would not necessarily expect these two curves to coincide. Fig. 10
shows that the conversion from B- to S-form continues well beyond the apparent end of the
force plateau, continuing to affect the force-extension curve. To determine whether S-DNA
is elastically similar to B-DNA one must disentangle the two states' contributions to the
stretching curve by globally fitting to a 2-state model, as we have done.

VI. CONCLUSION 

Sect. I summarizes our conclusions. Here we list a number of interesting modifications 
to the model, as possible extensions to this work. 

While the variational approximation used here has proved to be adequate, still it is
straightforward to replace it by the eigenfunction-expansion technique, which can be carried
to arbitrary accuracy (Marko and Siggia, 1995). Similarly, the methods of Sect. III can
be used to work in the full, discrete DPC model instead of the continuum approximation
used in Sect. IV.C. It is also straightforward to retain finite-length effects, by keeping the
subleading eigenvalue of the transfer matrix.

Real DNA is not a homogeneous rod. The methods of quenched disorder can be used
to introduce sequence-dependent contributions to the transition free energy $\alpha$ and the bend
stiffness $A$. Finally, we believe that the methods of this paper can be adapted to the study
of the stretching of individual polypeptide and polysaccharide molecules (Rief et al., 1998).
APPENDIX A: Derivation of $y(\omega)$, the variational approximation to $\lambda_{\max}$.

In this appendix we will derive an expression for $y(\omega)$ as defined in Eq. 18, which we
recall reads

\[ \|v_\omega\|^2\, y(\omega) = v_\omega \cdot \mathsf{T} \cdot v_\omega. \tag{A1} \]



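We will assume that the angles $\Theta_{i,i+1}$ between successive links are small, which allows us
to replace $(\Theta_{i,i+1})^2 = \arccos^2(\hat t_i\cdot\hat t_{i+1})$ by its small-angle approximation
$2(1 - \hat t_i\cdot\hat t_{i+1})$. The family of trial functions we use is parameterized by the single
parameter $\omega$: $v_\omega(\hat t) = e^{\omega\,\hat t\cdot\hat z}$.
Furthermore, we will ignore the two contributions from the beginning and end of the chain
(appearing for instance in Eq. 9), as they do not contribute to our result in the long chain
limit anyway. Thus the energy functional is

\[ \frac{E[\{\hat t_i\}]}{k_BT} = -\sum_{i=1}^{N-1}\left[ \frac{fb}{2k_BT}\,(\hat t_i + \hat t_{i+1})\cdot\hat z - \frac{A}{b}\,(1 - \hat t_i\cdot\hat t_{i+1}) \right]. \tag{A2} \]

According to Eq. 13, the matrix elements of $\mathsf{T}$ are given by

\[ \mathsf{T}(\hat t_i,\hat t_{i+1}) = \exp\left[ -\ell + \frac{\tilde f}{2}\,(\hat t_i + \hat t_{i+1})\cdot\hat z + \ell\,\hat t_i\cdot\hat t_{i+1} \right], \tag{A3} \]

where we use the dimensionless force $\tilde f = fb/k_BT$ and ratio of characteristic lengths

\[ \ell = \frac{A}{b}. \]

Working out the scalar products in Eq. A1 yields

\[ \|v_\omega\|^2\, y(\omega) = e^{-\ell}\int d^2\hat t_i \int d^2\hat t_{i+1}\, \exp\left[ \left(\frac{\tilde f}{2} + \omega\right)(\hat t_i + \hat t_{i+1})\cdot\hat z + \ell\,\hat t_i\cdot\hat t_{i+1} \right]. \tag{A4} \]

Defining an auxiliary vector

\[ \vec G = \left(\frac{\tilde f}{2} + \omega\right)\hat z + \ell\,\hat t_i \equiv G\,\hat g, \tag{A5} \]

with

\[ G = \|\vec G\| = \sqrt{\left(\frac{\tilde f}{2} + \omega\right)^2 + \ell^2 + \ell\,(\tilde f + 2\omega)\,\hat t_i\cdot\hat z}\,, \tag{A6} \]

simplifies Eq. A4, which now reads

\[ \|v_\omega\|^2\, y(\omega) = e^{-\ell}\int d^2\hat t_i\, \exp\left[ \left(\frac{\tilde f}{2} + \omega\right)\hat t_i\cdot\hat z \right] \int d^2\hat t_{i+1}\, \exp\left[ G\,\hat g\cdot\hat t_{i+1} \right]. \tag{A7} \]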
Transforming to spherical polar coordinates with $\hat g$ as the polar axis, the second integral can
be worked out to give $\frac{4\pi}{G}\sinh(G)$. Since the integral over $\hat t_i$ involves only terms containing
$\hat t_i\cdot\hat z$, the integration over the azimuthal angle simply yields $2\pi$. For the polar angle, we
change the integration variable to $G$ (which is a monotonic function of $\hat t_i\cdot\hat z$), bringing it to
the following form

\[ \|v_\omega\|^2\, y(\omega) = \frac{16\pi^2}{\ell\,(\tilde f + 2\omega)}\, \exp\left[ -\frac{3\ell}{2} - \frac{(\tilde f + 2\omega)^2}{8\ell} \right] \int_{\ell - \tilde f/2 - \omega}^{\ell + \tilde f/2 + \omega} dG\; e^{G^2/2\ell}\,\sinh(G). \tag{A8} \]
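The change of variables leading to Eq. A8 can be checked numerically. The sketch below evaluates the right-hand side of Eq. A7 after the inner integral has been replaced by $(4\pi/G)\sinh G$, integrating over $u = \hat t_i\cdot\hat z$, and compares it with the one-dimensional integral over $G$ in Eq. A8. The parameter values and function names are arbitrary illustrative choices, picked so that $\ell > \tilde f/2 + \omega$, and the integrals are done with a plain Simpson rule.

```python
import math

def simpson(func, lo, hi, n=2000):
    """Composite Simpson rule on [lo, hi]; n must be even."""
    h = (hi - lo) / n
    total = func(lo) + func(hi)
    for k in range(1, n):
        total += (4.0 if k % 2 else 2.0) * func(lo + k * h)
    return total * h / 3.0

def a7_polar_form(f_tilde, omega, ell):
    """Eq. A7 with the t_{i+1} integral done: integrate over u = t_i . z."""
    a = 0.5 * f_tilde + omega

    def integrand(u):
        G = math.sqrt(a * a + ell * ell + 2.0 * a * ell * u)   # Eq. A6
        return math.exp(a * u) * 4.0 * math.pi * math.sinh(G) / G

    # Factor 2*pi from the azimuthal angle of t_i.
    return math.exp(-ell) * 2.0 * math.pi * simpson(integrand, -1.0, 1.0)

def a8_form(f_tilde, omega, ell):
    """Eq. A8: the same quantity written as an integral over G."""
    a = 0.5 * f_tilde + omega
    prefactor = 16.0 * math.pi ** 2 / (ell * (f_tilde + 2.0 * omega))
    prefactor *= math.exp(-1.5 * ell - a * a / (2.0 * ell))
    integral = simpson(lambda G: math.exp(G * G / (2.0 * ell)) * math.sinh(G),
                       ell - a, ell + a)
    return prefactor * integral

# Illustrative parameters with ell > f_tilde/2 + omega.
lhs = a7_polar_form(f_tilde=1.0, omega=1.5, ell=5.0)
rhs = a8_form(f_tilde=1.0, omega=1.5, ell=5.0)

if __name__ == "__main__":
    print(lhs, rhs, abs(lhs - rhs) / rhs)
```

Agreement of the two numbers to within the quadrature error confirms the Jacobian $du = G\,dG/[\ell(\tilde f/2+\omega)]$ and the exponential prefactor.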



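The integral over $G$ can be performed analytically, and is most conveniently expressed in
terms of error functions as

\[ \int_{\ell - \tilde f/2 - \omega}^{\ell + \tilde f/2 + \omega} dG\; e^{G^2/2\ell}\,\sinh(G) = \frac{i\sqrt{\pi\ell}}{2\sqrt{2}}\; e^{-\ell/2} \left\{ 2\,\mathrm{Erf}\!\left[\frac{i\,(\tilde f + 2\omega)}{2\sqrt{2\ell}}\right] - \mathrm{Erf}\!\left[\frac{i\,(\tilde f + 4\ell + 2\omega)}{2\sqrt{2\ell}}\right] - \mathrm{Erf}\!\left[\frac{i\,(\tilde f - 4\ell + 2\omega)}{2\sqrt{2\ell}}\right] \right\}. \tag{A9} \]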
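Since Eq. A9 involves error functions of imaginary argument, a numerical cross-check is useful. Using $\mathrm{Erf}(ix) = i\,\mathrm{erfi}(x)$, the right-hand side becomes a real combination of three $\mathrm{erfi}$ terms. The sketch below compares that combination with a direct Simpson quadrature of the integral. The series implementation of $\mathrm{erfi}$, the function names and the parameter values are illustrative assumptions, not part of the original derivation.

```python
import math

def erfi(x, terms=200):
    """Imaginary error function via its Maclaurin series.

    erfi(x) = 2/sqrt(pi) * sum_n x^(2n+1) / (n! (2n+1)); adequate for the
    moderate arguments used here.
    """
    total, term = 0.0, x          # term holds x^(2n+1) / n!
    for n in range(terms):
        total += term / (2 * n + 1)
        term *= x * x / (n + 1)
    return 2.0 / math.sqrt(math.pi) * total

def simpson(func, lo, hi, n=2000):
    """Composite Simpson rule on [lo, hi]; n must be even."""
    h = (hi - lo) / n
    total = func(lo) + func(hi)
    for k in range(1, n):
        total += (4.0 if k % 2 else 2.0) * func(lo + k * h)
    return total * h / 3.0

def a9_closed_form(f_tilde, omega, ell):
    """Right-hand side of Eq. A9 rewritten with real erfi functions."""
    s = 2.0 * math.sqrt(2.0 * ell)
    bracket = (erfi((f_tilde + 4.0 * ell + 2.0 * omega) / s)
               + erfi((f_tilde - 4.0 * ell + 2.0 * omega) / s)
               - 2.0 * erfi((f_tilde + 2.0 * omega) / s))
    return 0.5 * math.sqrt(0.5 * math.pi * ell) * math.exp(-0.5 * ell) * bracket

def a9_numeric(f_tilde, omega, ell):
    """Left-hand side of Eq. A9 by direct quadrature."""
    a = 0.5 * f_tilde + omega
    return simpson(lambda G: math.exp(G * G / (2.0 * ell)) * math.sinh(G),
                   ell - a, ell + a)

closed = a9_closed_form(1.0, 1.5, 5.0)
numeric = a9_numeric(1.0, 1.5, 5.0)

if __name__ == "__main__":
    print(closed, numeric, abs(closed - numeric) / numeric)
```

The real form follows from completing the square, $e^{G^2/2\ell}e^{\pm G} = e^{-\ell/2}e^{(G\pm\ell)^2/2\ell}$, which is also where the factor $e^{-\ell/2}$ in Eq. A9 comes from.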

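Eq. A9 is valid only in the regime where $\ell > \tilde f/2 + \omega$, which is satisfied as long as
one chooses $A > b$. Note that the error functions have imaginary arguments. Using the
normalization quoted in Eq. 17 we can now express $y(\omega)$ in a form that is well suited for
further (numerical) manipulations:

\[ y(\omega) = \frac{2\sqrt{2}\; i\,\pi^{3/2}\,\omega\,\mathrm{csch}(2\omega)}{\sqrt{\ell}\,(2\omega + \tilde f)}\; \exp\left[ -2\ell - \frac{(\tilde f + 2\omega)^2}{8\ell} \right] \times \left\{ 2\,\mathrm{Erf}\!\left[\frac{i\,(\tilde f + 2\omega)}{2\sqrt{2\ell}}\right] - \mathrm{Erf}\!\left[\frac{i\,(\tilde f + 4\ell + 2\omega)}{2\sqrt{2\ell}}\right] - \mathrm{Erf}\!\left[\frac{i\,(\tilde f - 4\ell + 2\omega)}{2\sqrt{2\ell}}\right] \right\}. \tag{A10} \]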
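As an illustration of the "further (numerical) manipulations", the sketch below maximizes $\ln y(\omega)$ over $\omega$ and differentiates the result with respect to $\tilde f$. It rests on two assumptions that are not spelled out in this appendix: that the relative extension of a long chain is $\partial\ln\lambda_{\max}/\partial\tilde f$ (the usual transfer-matrix relation), and that $\ln y$ has a single maximum in $\omega$. To avoid the validity restriction of Eq. A9, $y(\omega)$ is evaluated by direct quadrature over $u = \hat t_i\cdot\hat z$ rather than from Eq. A10. Stretch elasticity is not included, and all names are illustrative.

```python
import math

def simpson(func, lo, hi, n=400):
    """Composite Simpson rule on [lo, hi]; n must be even."""
    h = (hi - lo) / n
    total = func(lo) + func(hi)
    for k in range(1, n):
        total += (4.0 if k % 2 else 2.0) * func(lo + k * h)
    return total * h / 3.0

def log_y(omega, f_tilde, ell):
    """ln y(omega) for the trial function v(t) = exp(omega t.z), omega > 0."""
    a = 0.5 * f_tilde + omega

    def integrand(u):
        G = math.sqrt(max(0.0, a * a + ell * ell + 2.0 * a * ell * u))
        sinhc = math.sinh(G) / G if G > 1e-8 else 1.0   # sinh(G)/G -> 1 at G = 0
        return math.exp(a * u) * 4.0 * math.pi * sinhc

    numerator = math.exp(-ell) * 2.0 * math.pi * simpson(integrand, -1.0, 1.0)
    norm = 2.0 * math.pi * math.sinh(2.0 * omega) / omega   # ||v||^2
    return math.log(numerator / norm)

def best_omega(f_tilde, ell, lo=1e-6, hi=30.0, tol=1e-7):
    """Golden-section search for the omega that maximizes ln y (assumed unimodal)."""
    inv_phi = (math.sqrt(5.0) - 1.0) / 2.0
    c = hi - inv_phi * (hi - lo)
    d = lo + inv_phi * (hi - lo)
    fc, fd = log_y(c, f_tilde, ell), log_y(d, f_tilde, ell)
    while hi - lo > tol:
        if fc > fd:                       # maximum lies in [lo, d]
            hi, d, fd = d, c, fc
            c = hi - inv_phi * (hi - lo)
            fc = log_y(c, f_tilde, ell)
        else:                             # maximum lies in [c, hi]
            lo, c, fc = c, d, fd
            d = lo + inv_phi * (hi - lo)
            fd = log_y(d, f_tilde, ell)
    omega = 0.5 * (lo + hi)
    return omega, log_y(omega, f_tilde, ell)

def extension(f_tilde, ell, h=1e-3):
    """Relative extension, d ln(lambda_max) / d f_tilde, by central difference."""
    up = best_omega(f_tilde + h, ell)[1]
    down = best_omega(f_tilde - h, ell)[1]
    return (up - down) / (2.0 * h)

if __name__ == "__main__":
    for ell in (0.0, 1.0, 5.0):
        print(f"ell = {ell}: extension at f_tilde = 1 is {extension(1.0, ell):.4f}")
```

For $\ell = 0$ the transfer matrix factorizes, the trial function with $\omega = \tilde f/2$ is exact, and the extension reduces to the freely jointed chain result $\coth\tilde f - 1/\tilde f$, which gives a convenient sanity check on the procedure.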
FIG. 1 The Freely Jointed Chain consists of identical segments of length $b$, joined together by
free hinges. The configuration is fully described by the collection of orientation vectors $\{\hat t_i\}$.
$\theta_i$ denotes the angle between $\hat t_i$ and the fixed direction $\hat z$ of the applied stretching force.




FIG. 2 A Worm Like Chain is a continuum elastic medium, whose configuration is described in 
terms of the position vector r as a function of the contour length s. 
FIG. 4 Comparison between the exact WLC force-extension solution and the Ritz variational
approximation. The deviation $\mathrm{dev}(\tilde f)$ is defined as
$100\% \times (z(\tilde f)_{\rm exact} - z(\tilde f)_{\rm var})/z(\tilde f)_{\rm exact}$, with $\tilde f$
the dimensionless force $\tilde f = fA/k_BT$. The maximal error induced by the variational approximation is
about 1%. Data for the exact solution were taken from (Bouchiat et al., 1999).
FIG. 5 Least-squares fit (solid line) of the single-stranded DNA stretching data (closed circles)
from (Rief et al., 1999) to the extensible FJC model. Included in the fit are the data up to a force
of 100 pN. Fitting only those data points yields a link length $b = 1.75$ nm and a stretch modulus
$E = 8\times10^{2}$ pN, reproducing the typical values as cited for instance in (Clausen-Schaumann et al.,
2000; Hegner et al., 1999; Rief et al., 1999). In this graph, we have extrapolated this fit to the
high-force range, to demonstrate that the parameters as extracted from the low-force data do not
represent the full range of data faithfully.
FIG. 6 Fit of the extensible DPC model (solid line) to the single-strand DNA stretching data
(circles) kindly supplied by M. Rief; see (Rief et al., 1999). The fit shown was obtained for
$b = 0.17$, $E = 4.5\times10^{3}$ pN, $L_{\rm tot} = 3.9$ μm, and k dpc = 1 sf nm. In addition, the dashed and
dotted lines show the corresponding best fits to the extensible FJC and WLC, respectively. All
fits include the data points only for forces between 20 pN and 400 pN. Values for $\chi^2$ were EFJC:
$\chi^2 = 1.269$; EWLC: $\chi^2 = 0.600$; and EDPC: $\chi^2 = 0.490$, at $N = 1523$. We ignore the lowest-force
points because of complications induced by hairpins and other secondary structures in the DNA.
FIG. 7 Conventions for the Ising-DPC model. We take $\sigma = +1$ to correspond to B-DNA, and
$\sigma = -1$ to S-DNA. Each segment of S-DNA is longer than B-DNA by a factor $\zeta$. Definitions of
$\hat t_i$, $\theta_i$ and the remaining symbols are the same as before.
FIG. 8 Least-squares fit of the Ising-DPC model to an overstretching dataset (48.5 kbp λ DNA
construct; buffer 500 mM NaCl, 20 mM Tris, pH 8). Data kindly supplied by C. Bustamante and
S. Smith. Fit parameters: $A = 43.75$ nm, $\alpha_0 = 5.45$ nm$^{-1}$, $\beta = 0.16$, $Q = 0.13$, $\zeta = 1.76$,
$E_B = 1.2\times10^{3}$ pN and $E_S = 1.0\times10^{4}$ pN. $\chi^2 = 9.22$ at $N = 825$; points with relative
extension between 1.11 and 1.55 were excluded from the fit. For further discussion see Sect. IV.D.
FIG. 9 Least-squares fit of the Ising-DPC model to an overstretching dataset obtained from a
15.1 μm sample of EMBL3 λ DNA in phosphate-buffered solution (100 mM; 80 mM Na$^+$ and 0.01%
Tween) from (Cluzel et al., 1996). Data kindly supplied by J. Marko. Fit parameters: $A = 52.63$ nm,
$\alpha_0 = 4.82$ nm$^{-1}$, $\beta = 0.08$, $Q = 0.23$, $\zeta = 1.71$, $E_B = 7.3\times10^{2}$ pN and
$E_S = 3\times10^{4}$ pN. $\chi^2 = 2.15$ at $N = 339$; points with relative extension between 1.15 and 1.5
were excluded from the fit. For further discussion see Sect. IV.D.
FIG. 10 $P(S)$, the relative population of the S-state, versus the applied stretching force, as
calculated from Eq. 47. The inset shows that the S-state has a nonzero population even at zero force.
Parameter values are those from Fig. 9.
FIG. 11 The typical length of an S-domain, $L_{\rm dom}$, vs. the stretching force, calculated using Eq. 49.
Parameter values are those of Fig. 9. The asymptotic slope of the linear increase has been
determined to be 3.15 nm pN$^{-1}$. Note that even at 120 pN, the typical size of an S-domain is only
160 nm, or about 480 basepairs.